71 research outputs found
Region-Based Image Retrieval Revisited
The region-based image retrieval (RBIR) technique is revisited. In early attempts
at RBIR in the late 90s, researchers found many ways to specify region-based
queries and spatial relationships; however, the ways of characterizing the
regions, such as color histograms, were very poor at that time. Here,
we revisit RBIR by incorporating semantic specification of objects and
intuitive specification of spatial relationships. Our contributions are the
following. First, to support multiple aspects of semantic object specification
(category, instance, and attribute), we propose a multitask CNN feature that
allows us to use deep learning techniques and to jointly handle multi-aspect
object specification. Second, to help users specify spatial relationships among
objects in an intuitive way, we propose techniques for recommending spatial
relationships. In particular, by mining the search results, a system can
recommend feasible spatial relationships among the objects. The system can also
recommend likely spatial relationships from the assigned object category names,
based on a language prior. Moreover, object-level inverted indexing supports very fast
shortlist generation, and re-ranking based on spatial constraints provides
users with instant RBIR experiences. Comment: To appear in ACM Multimedia 2017 (Oral)
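The shortlist-then-re-rank idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' system: the toy database, the function names, and the binary "left of" score are all assumptions made for illustration, and plain category labels stand in for the multitask CNN features.

```python
from collections import defaultdict

# Hypothetical toy database: image id -> list of (category, (x_center, y_center)).
DATABASE = {
    "img1": [("dog", (0.2, 0.5)), ("person", (0.7, 0.5))],
    "img2": [("person", (0.3, 0.5)), ("dog", (0.8, 0.5))],
    "img3": [("dog", (0.5, 0.5))],
}


def build_inverted_index(database):
    """Map each object category to the set of images that contain it."""
    index = defaultdict(set)
    for image_id, objects in database.items():
        for category, _ in objects:
            index[category].add(image_id)
    return index


def shortlist(index, categories):
    """Return the images that contain every queried category."""
    sets = [index.get(c, set()) for c in categories]
    return set.intersection(*sets) if sets else set()


def left_of_score(database, image_id, left, right):
    """Assumed spatial score: 1.0 if some `left` object is left of some `right` object."""
    objects = database[image_id]
    lefts = [pos[0] for cat, pos in objects if cat == left]
    rights = [pos[0] for cat, pos in objects if cat == right]
    return 1.0 if any(l < r for l in lefts for r in rights) else 0.0


def search(database, index, left, right):
    """Shortlist by category, then re-rank by the spatial constraint."""
    candidates = shortlist(index, [left, right])
    return sorted(
        candidates,
        key=lambda i: (-left_of_score(database, i, left, right), i),
    )


INDEX = build_inverted_index(DATABASE)
RESULTS = search(DATABASE, INDEX, "dog", "person")
```

The inverted index keeps the shortlist step to a few set intersections, so the more expensive spatial check only runs on images that already contain all queried objects.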
Rethinking Adversarial Training with A Simple Baseline
We report competitive results on RobustBench for CIFAR and SVHN using a
simple yet effective baseline approach. Our approach involves a training
protocol that integrates rescaled square loss, cyclic learning rates, and
erasing-based data augmentation. Our results are comparable
to those of the model trained with state-of-the-art techniques, which is
currently the predominant choice for adversarial training. Our baseline,
referred to as SimpleAT, yields three novel empirical insights. (i) Switching
to square loss gives accuracy comparable to that obtained with
the de facto training protocol plus data augmentation. (ii) One cyclic
learning rate is a good scheduler, which can effectively reduce the risk of
robust overfitting. (iii) Employing rescaled square loss during model training
can yield a favorable balance between adversarial and natural accuracy. In
general, our experimental results show that SimpleAT effectively mitigates
robust overfitting and consistently achieves the best performance at the end of
training. For example, on CIFAR-10 with ResNet-18, SimpleAT achieves
approximately 52% adversarial accuracy against the current strong AutoAttack.
Furthermore, SimpleAT exhibits robust performance on various image corruptions,
including those commonly found in the CIFAR-10-C dataset. Finally, we assess the
effectiveness of these insights through two techniques: bias-variance analysis
and logit penalty methods. Our findings demonstrate that all of these simple
techniques are capable of reducing the variance of model predictions, which is
regarded as the primary contributor to robust overfitting. Our
analysis also uncovers connections with various advanced state-of-the-art
methods. Comment: 19 pages, 8 figures, 6 tables
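For readers unfamiliar with the term, the sketch below shows one form of rescaled square loss that appears in the literature on square loss for classification. The abstract does not give the formula or the values used in SimpleAT, so the exact form and the parameters `k` and `M` here are assumptions, not the authors' settings.

```python
def rescaled_square_loss(logits, label, k=1.0, M=1.0):
    """Square loss against a rescaled one-hot target (assumed form).

    The true-class logit is pulled toward M with weight k; every other
    logit is pulled toward 0. With k = M = 1 this is the ordinary
    square loss against a one-hot vector, averaged over classes.
    """
    num_classes = len(logits)
    total = 0.0
    for i, z in enumerate(logits):
        if i == label:
            total += k * (z - M) ** 2  # true class: target M, weight k
        else:
            total += z ** 2  # other classes: target 0
    return total / num_classes
```

For example, `rescaled_square_loss([1.0, 0.0, 0.0], 0)` is zero because the logits already match the one-hot target, while larger `k` and `M` put more emphasis on the true-class logit.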
Towards robots reasoning about group behavior of museum visitors: leader detection and group tracking
The final publication is available at IOS Press through http://dx.doi.org/10.3233/AIS-170467 Peer Reviewed. Postprint (author's final draft)
Multiple Object Tracking from appearance by hierarchically clustering tracklets
Current approaches in Multiple Object Tracking (MOT) rely on the
spatio-temporal coherence between detections combined with object appearance to
match objects from consecutive frames. In this work, we explore MOT using
object appearances as the main source of association between objects in a
video, using spatial and temporal priors as weighting factors. We form initial
tracklets by leveraging the idea that instances of an object that are close
in time should be similar in appearance, and build the final object tracks by
fusing the tracklets in a hierarchical fashion. We conduct extensive
experiments that show the effectiveness of our method over three different MOT
benchmarks, MOT17, MOT20, and DanceTrack, being competitive in MOT17 and MOT20
and establishing state-of-the-art results in DanceTrack. Comment: To be published in BMVC 202
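The idea of fusing tracklets hierarchically by appearance can be sketched as a simple agglomerative merge. This is an illustration under stated assumptions, not the authors' method: the tracklet representation, the cosine similarity, the `min_similarity` threshold, and the rule that temporally overlapping tracklets never merge are all hypothetical, and the spatial and temporal weighting mentioned in the abstract is omitted.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def merge(t1, t2):
    """Merge two tracklets; the feature is the frame-weighted mean."""
    n1, n2 = len(t1["frames"]), len(t2["frames"])
    feat = [(n1 * a + n2 * b) / (n1 + n2) for a, b in zip(t1["feat"], t2["feat"])]
    return {"frames": t1["frames"] | t2["frames"], "feat": feat}


def fuse_tracklets(tracklets, min_similarity=0.8):
    """Repeatedly merge the most similar pair of non-overlapping tracklets."""
    tracks = [dict(t) for t in tracklets]
    while True:
        best = None
        for i in range(len(tracks)):
            for j in range(i + 1, len(tracks)):
                # Assumption: one object cannot appear twice in the same frame.
                if tracks[i]["frames"] & tracks[j]["frames"]:
                    continue
                sim = cosine(tracks[i]["feat"], tracks[j]["feat"])
                if sim >= min_similarity and (best is None or sim > best[0]):
                    best = (sim, i, j)
        if best is None:
            return tracks  # no pair left that is similar enough
        _, i, j = best
        merged = merge(tracks[i], tracks[j])
        tracks = [t for k, t in enumerate(tracks) if k not in (i, j)] + [merged]


# Hypothetical tracklets: sets of frame numbers plus a mean appearance vector.
TRACKLETS = [
    {"frames": {1, 2, 3}, "feat": [1.0, 0.0]},
    {"frames": {4, 5}, "feat": [0.9, 0.1]},
    {"frames": {1, 2, 3, 4, 5}, "feat": [0.0, 1.0]},
]
TRACKS = fuse_tracklets(TRACKLETS)
```

In this toy input the first two tracklets look alike and do not overlap in time, so they fuse into one track, while the third stays separate because it shares frames with both.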
Context-based conceptual image indexing
Automatic semantic classification of image databases is very useful for users searching and browsing, but it is also a very challenging research problem. Local-feature-based image classification is one of the key issues in bridging the semantic gap in order to detect concepts. This paper proposes a framework for incorporating contextual information into the concept detection process. The proposed method combines local and global classifiers with stacking, using SVM. We studied the impact of topologic and semantic contexts on concept detection performance and proposed solutions to handle the large number of dimensions involved in classified data. We conducted experiments on a TRECVID'04 subset with 48104 images and 5 concepts. We found that the use of context yields a significant improvement for both the topologic and semantic contexts.